Enhancements to a Morphological Generator to Capture Arabic Morphology

Authors

  • VIOLETTA CAVALLI-SFORZA
  • ABDELHADI SOUDI
Abstract

We describe an enhanced version of the MORPHE tool, a morphological analyzer/generator designed to interface with a knowledge-based machine translation system. MORPHE uses a hierarchy (tree structure) to relate various morphological forms to each other based on common and distinctive features. Transformational rules are attached to the leaf nodes of the hierarchy. In generation, MORPHE takes a feature structure as input and pushes it through the hierarchy, which acts as a discrimination net. When a leaf node is reached, MORPHE applies the attached rule. Each rule may contain several mutually exclusive clauses, each of which attempts to match a pattern against the base string contained in the feature structure and, if the match succeeds, applies operators to the string to produce a transformed string. Our enhancements to MORPHE were motivated by an attempt to use the tool to generate Arabic morphology. The non-concatenative morphology typical of Semitic languages has spurred the development of sophisticated formalisms and computational engines, and has also produced brute-force approaches. In this paper we show how the relatively straightforward formalism used in MORPHE can be extended in simple ways to produce an elegant treatment of Arabic morphology that captures the inflectional regularities of the language. The result is the ability to describe Arabic morphology, as well as the morphology of any language whose word forms undergo stem changes, using a set of rules that contains minimal duplication, is easy to understand and maintain, and is useful for language learning as well as for machine translation applications.
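The generation procedure summarized in the abstract (a feature structure pushed through a hierarchy until a leaf rule fires) can be illustrated with a minimal Python sketch. This is not MORPHE's actual rule language or implementation: the names `Clause`, `Rule`, `Node`, and `generate`, the use of regular expressions as patterns, and the toy English plural rule are all assumptions chosen only to show the control flow.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Clause:
    """One clause of a rule: a pattern plus an operator (hypothetical names)."""
    pattern: str                     # regex matched against the base string
    operator: Callable[[str], str]   # transformation applied when the pattern matches


@dataclass
class Rule:
    """Clauses are tried in order; the first match wins, so they are mutually exclusive."""
    clauses: List[Clause]

    def apply(self, base: str) -> str:
        for clause in self.clauses:
            if re.search(clause.pattern, base):
                return clause.operator(base)
        return base  # no clause matched: leave the base string unchanged


@dataclass
class Node:
    """A node of the form hierarchy; only leaf nodes carry a rule."""
    test: Dict[str, str]                         # features required to enter this node
    children: List["Node"] = field(default_factory=list)
    rule: Optional[Rule] = None


def generate(node: Node, fs: Dict[str, str]) -> Optional[str]:
    """Push feature structure `fs` through the hierarchy, used as a discrimination net."""
    if any(fs.get(key) != value for key, value in node.test.items()):
        return None                              # this branch does not apply
    if node.rule is not None:                    # leaf reached: apply its rule
        return node.rule.apply(fs["base"])
    for child in node.children:
        result = generate(child, fs)
        if result is not None:
            return result
    return None


# Toy English noun hierarchy (an illustration only, not Arabic and not from the paper).
plural_rule = Rule([
    Clause(r"(s|x|z|ch|sh)$", lambda s: s + "es"),
    Clause(r"[^aeiou]y$", lambda s: s[:-1] + "ies"),
    Clause(r"", lambda s: s + "s"),              # default clause: matches any string
])
singular_rule = Rule([Clause(r"", lambda s: s)])

hierarchy = Node(test={}, children=[
    Node(test={"cat": "n"}, children=[
        Node(test={"number": "sg"}, rule=singular_rule),
        Node(test={"number": "pl"}, rule=plural_rule),
    ]),
])

print(generate(hierarchy, {"base": "box", "cat": "n", "number": "pl"}))
print(generate(hierarchy, {"base": "city", "cat": "n", "number": "pl"}))
```

In this sketch the hierarchy only selects which rule applies, while all string manipulation lives in the leaf rules. Ordering the clauses from most to least specific is what makes them mutually exclusive here; the real tool may enforce exclusivity differently.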

Similar resources

IMORPHĒ: An Inheritance and Equivalence Based Morphology Description Compiler

IMORPHĒ is a significantly extended version of MORPHĒ, a morphology description compiler. MORPHĒ’s morphology description language is based on two constructs: 1) a morphological form hierarchy, whose nodes relate and differentiate surface forms in terms of the common and distinguishing inflectional features of lexical items; and 2) transformational rules, attached to leaf nodes of the hierarchy...

Unsupervised Induction of Arabic Root and Pattern Lexicons using Machine Learning

We describe an approach to building a morphological analyser of Arabic by inducing a lexicon of root and pattern templates from an unannotated corpus. Using maximum entropy modelling, we capture orthographic features from surface words, and cluster the words based on the similarity of their possible roots or patterns. From these clusters, we extract root and pattern lexicons, which allows us to...

MAGEAD: A Morphological Analyzer and Generator for the Arabic Dialects

We present MAGEAD, a morphological analyzer and generator for the Arabic language family. Our work is novel in that it explicitly addresses the need for processing the morphology of the dialects. MAGEAD performs an on-line analysis to or generation from a root+pattern+features representation, it has separate phonological and orthographic representations, and it allows for combining morphemes fr...

Morphological Analysis and Generation for Arabic Dialects

We present MAGEAD, a morphological analyzer and generator for the Arabic language family. Our work is novel in that it explicitly addresses the need for processing the morphology of the dialects. MAGEAD provides an analysis to a root+pattern representation, it has separate phonological and orthographic representations, and it allows for combining morphemes from different dialects.

A Unification Based Approach to the Morphological Analysis and Generation of Arabic

In this paper, we present a powerful Arabic morphological analyzer and generator. The approach employs finite state machines enriched with unification capability. The presented system is used as a component in both statistical and rule based machine translation systems. We give detailed illustrations on how we handle nominal and verbal morphology in Arabic. Issues regarding derivational morphol...

Publication date: 2002